Introduction

This coursework focuses on housing prices, with the main objective being to predict the price of a property based on various inputs. The inputs include features such as the area, the number and types of rooms, and additional factors like the availability of a main road, hot water heating, and more.

The dependent variable is the price, as it is the primary concern for most people searching for a house. The goal of this work is to predict the price based on diverse inputs, which consist of mixed data types, such as:

  • Numerical values
  • Text-based responses like “yes” or “no”
  • Categories for furnishing status, including “furnished,” “semi-furnished,” or “unfurnished.”

This project addresses a regression problem because the objective is to predict a numeric value—in this case, the price of the property.

Collection / Preparation

Now we are going to import our dataset into this project.

dt_houses <- fread(file = "./datasets/Regression_set.csv")


I would like to check whether there is any missing data in my dataset. I think it is a good idea to go through all rows and columns and check whether there is an NA. I want to do this with the built-in R function complete.cases(data_table). This function returns TRUE for every row that contains no NA values and FALSE otherwise, so negating it selects the incomplete rows.

nas <- dt_houses[!complete.cases(dt_houses)]
nas
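
As a small illustration of how complete.cases() behaves, here is a minimal sketch on a made-up toy table (the values below are hypothetical and are not taken from the housing dataset). It uses a base data.frame to stay self-contained, and it also counts the NA values per column, which can help to locate a problem if any incomplete rows show up.

```r
# Toy table with deliberately missing values (hypothetical data)
toy <- data.frame(
  price    = c(100, NA, 300),
  area     = c(50, 60, NA),
  mainroad = c("yes", "no", "yes")
)

# TRUE for rows without any NA, FALSE for rows with at least one NA
row_is_complete <- complete.cases(toy)

# Number of NA values in each column
na_per_column <- colSums(is.na(toy))

# Rows that contain at least one NA
incomplete_rows <- toy[!row_is_complete, ]

print(row_is_complete)
print(na_per_column)
print(incomplete_rows)
```

If the table of incomplete rows is empty, as it appears to be for the housing data, no imputation or row removal is needed.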

That looks great, now we can explore our dataset :)

Exploration

Before we explore our data, I want to import all the libraries that we will probably use:

library(data.table)
library(ggcorrplot)
library(ggExtra)
library(ggplot2)
library(ggridges)
library(ggsci)
library(ggthemes)
library(RColorBrewer)
library(svglite)
library(viridis)
library(scales)
library(rpart)
library(rpart.plot)

I found some helpful functions in R for taking a first look at the data. We will start with the structure, then get some summary statistics, and finally take the head() of the data.

str(dt_houses)
Classes ‘data.table’ and 'data.frame':  545 obs. of  13 variables:
 $ price           : int  13300000 12250000 12250000 12215000 11410000 10850000 10150000 10150000 9870000 9800000 ...
 $ area            : int  7420 8960 9960 7500 7420 7500 8580 16200 8100 5750 ...
 $ bedrooms        : int  4 4 3 4 4 3 4 5 4 3 ...
 $ bathrooms       : int  2 4 2 2 1 3 3 3 1 2 ...
 $ stories         : int  3 4 2 2 2 1 4 2 2 4 ...
 $ mainroad        : chr  "yes" "yes" "yes" "yes" ...
 $ guestroom       : chr  "no" "no" "no" "no" ...
 $ basement        : chr  "no" "no" "yes" "yes" ...
 $ hotwaterheating : chr  "no" "no" "no" "no" ...
 $ airconditioning : chr  "yes" "yes" "no" "yes" ...
 $ parking         : int  2 3 2 3 2 2 2 0 2 1 ...
 $ prefarea        : chr  "yes" "no" "yes" "yes" ...
 $ furnishingstatus: chr  "furnished" "furnished" "semi-furnished" "furnished" ...
 - attr(*, ".internal.selfref")=<externalptr> 


Summary statistics:

summary(dt_houses[, .(price, area, bedrooms, bathrooms, stories, parking)])
     price               area          bedrooms       bathrooms        stories         parking      
 Min.   : 1750000   Min.   : 1650   Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :0.0000  
 1st Qu.: 3430000   1st Qu.: 3600   1st Qu.:2.000   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:0.0000  
 Median : 4340000   Median : 4600   Median :3.000   Median :1.000   Median :2.000   Median :0.0000  
 Mean   : 4766729   Mean   : 5151   Mean   :2.965   Mean   :1.286   Mean   :1.806   Mean   :0.6936  
 3rd Qu.: 5740000   3rd Qu.: 6360   3rd Qu.:3.000   3rd Qu.:2.000   3rd Qu.:2.000   3rd Qu.:1.0000  
 Max.   :13300000   Max.   :16200   Max.   :6.000   Max.   :4.000   Max.   :4.000   Max.   :3.0000  


And this is a sample of our dataset:

head(dt_houses)

I would like to start with the densities of the main variables which, from my domain knowledge, are important for the price of a property.

Price density:

ggplot(data = dt_houses, aes(x = price)) + 
  geom_density(fill="#f1b147", color="#f1b147", alpha=0.25) + 
  labs(
    x = 'Price',
    y = 'Density'
  ) +
  geom_vline(xintercept = mean(dt_houses$price), linetype="dashed") + 
  scale_x_continuous(labels = label_number(scale = 1e-6, suffix = "M")) + 
  theme_minimal() + 
  theme(axis.line = element_line(color = "#000000"))

It is very clear that most of the prices are below ~6 million: the minimum is 1.75 million, and three quarters of the properties cost less than 5.74 million.

Area density:

ggplot(data = dt_houses, aes(x = area)) + 
  geom_density(fill="#f1b147", color="#f1b147", alpha=0.25) + 
  labs(
    x = 'Area',
    y = 'Density'
  ) +
  theme_minimal() + 
  theme(axis.line = element_line(color = "#000000"))

The area density looks a little more centered, but it is still skewed to the right: the mean (5151) is above the median (4600), and there is a long tail of large properties.
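
The direction of the skew can also be checked numerically. The following is a minimal sketch on simulated data, not on the housing dataset: toy_area is a hypothetical right-skewed sample, and sample_skewness() is a small helper written here for illustration (it is not a base R function). A positive value indicates a longer right tail, a negative value a longer left tail.

```r
# Moment-based sample skewness (helper defined for this sketch)
sample_skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / (mean((x - m)^2))^1.5
}

set.seed(1)
# Hypothetical right-skewed "area" values drawn from a log-normal distribution
toy_area <- rlnorm(500, meanlog = 8.5, sdlog = 0.4)

skew_raw <- sample_skewness(toy_area)       # positive for a right-skewed sample
skew_log <- sample_skewness(log(toy_area))  # much closer to zero after the log

print(c(mean = mean(toy_area), median = median(toy_area)))
print(c(skew_raw = skew_raw, skew_log = skew_log))
```

For a right-skewed variable the mean lies above the median, which is the same pattern the summary table shows for area.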


How does area affect the price of a house? We will plot it with points, where price is on the y-axis and area is on the x-axis.

ggplot() + 
  geom_point(data = dt_houses, aes(x = area, y = price, color = parking)) +
  scale_y_continuous(labels = label_number(scale = 1e-6, suffix = "M")) + 
  theme_minimal() + 
  theme(axis.line = element_line(color = "#000000"))

This looks nice, and it is also logical: more space, higher price. But if we look at the number of parking places, it is hard to see a trend.

Now for the simplest idea: how does the number of bedrooms correlate with the price?

ggplot(data = dt_houses, aes(x = factor(bedrooms), y = price)) +
  geom_boxplot() + 
  theme_minimal() 

We can see that, on average, more bedrooms mean a higher price, but I think there is no really strong relationship between these two variables.

It would also be great to take a look at a histogram of bedrooms:

ggplot(data = dt_houses, aes(x = bedrooms)) + 
  geom_histogram(fill="#2f9e44", color="#2f9e44", alpha=0.25) + 
  geom_vline(xintercept = mean(dt_houses$bedrooms), linetype="dashed") + 
  theme_minimal() + 
  theme(axis.line = element_line(color = "#000000"))

Mean number of bedrooms:

mean(dt_houses$bedrooms)
[1] 2.965138

Here we can see that most of the properties tend to have 2, 3 or 4 bedrooms.

Let’s have a look at the histogram of stories:

ggplot(data = dt_houses, aes(x = stories)) + 
  geom_histogram(fill="#2f9e44", color="#2f9e44", alpha=0.25) + 
  geom_vline(xintercept = mean(dt_houses$stories), linetype="dashed") + 
  theme_minimal() + 
  theme(axis.line = element_line(color = "#000000"))

mean(dt_houses$stories)
[1] 1.805505

We can see that most of the houses have 1 or 2 stories.

Bathrooms are also an interesting variable, so let’s take a look at a histogram of bathrooms and a boxplot of price by number of bathrooms:

ggplot(data = dt_houses, aes(x = bathrooms)) + 
  geom_histogram(fill="#2f9e44", color="#2f9e44", alpha=0.25) + 
  geom_vline(xintercept = mean(dt_houses$bathrooms), linetype="dashed") + 
  theme_minimal() + 
  theme(axis.line = element_line(color = "#000000"))

ggplot(data = dt_houses, aes(x = factor(bathrooms), y = price)) +
  geom_boxplot() + 
  theme_minimal() 

Here it is also almost obvious that the more bathrooms we have, the higher the price. The only disadvantage is that my dataset does not contain enough properties with 3 or 4 bathrooms: there are a few with 3, but a real lack of properties with 4.

Furnishing is also important: many people search for apartments with furniture, but the furniture may not be in the best shape, or the buyer may not like the style. So, in my opinion, it is not as strong a predictor as, for example, area.

How many properties are furnished or not:

ggplot(data = dt_houses, aes(x = factor(furnishingstatus), fill = factor(furnishingstatus))) + 
  geom_bar(color="#ced4da", alpha=0.25) + 
  scale_fill_viridis_d(option = "D") + 
  labs(title = "Number of Properties by Furnishing Status", 
       x = "Furnishing Status", 
       y = "Count") + 
  theme_minimal() + 
  theme(axis.line = element_line(color = "#000000"))

We can see that most of the houses are semi-furnished, which is also logical: when we sell a house or apartment, in most cases we would probably take the things that are most valuable to us, including some of the furniture.

Now it would be great to look at the price and area distribution for differently furnished properties:

ggplot(data = dt_houses, aes(y = price, x = area)) + 
  geom_point(data = dt_houses, aes(y = price, x = area, color = bedrooms)) +
  geom_hline(yintercept = mean(dt_houses$price), linetype='dashed') + 
  facet_grid(.~furnishingstatus) +
  scale_y_continuous(labels = label_number(scale = 1e-6, suffix = "M")) +
  scale_color_distiller(type = "seq", palette = "Greens") +
  theme_minimal() + 
  theme(axis.line = element_line(color = "#000000"))

On average, unfurnished houses are also less expensive.

We can also take a look at some pie charts:


dt_mainroad_counts <- as.data.frame(table(dt_houses$mainroad)) #table() - creates frequency table
colnames(dt_mainroad_counts) <- c("mainroad_status", "count")
dt_mainroad_counts$percentage <- round(dt_mainroad_counts$count / sum(dt_mainroad_counts$count) * 100, 1)

ggplot(data = dt_mainroad_counts, aes(x = "", y = count, fill = mainroad_status)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  geom_text(aes(label = paste0(percentage, "%")), 
            position = position_stack(vjust = 0.5), color = "white", size = 4) +  
  theme_void() +  
  scale_fill_manual(values = c("#F1B147", "#47B1F1")) + 
  labs(
    title = "Distribution of Mainroad Status",
    fill = "Mainroad Status"
  )

Almost 86 percent of the houses are on a main road, so maybe this won’t be a strong predictor variable.


dt_airconditioning_counts <- as.data.frame(table(dt_houses$airconditioning)) #table() - creates frequency table
colnames(dt_airconditioning_counts) <- c("airconditioning_status", "count")
dt_airconditioning_counts$percentage <- round(dt_airconditioning_counts$count / sum(dt_airconditioning_counts$count) * 100, 1)

ggplot(data = dt_airconditioning_counts, aes(x = "", y = count, fill = airconditioning_status)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  geom_text(aes(label = paste0(percentage, "%")), 
            position = position_stack(vjust = 0.5), color = "white", size = 4) +  
  theme_void() +  
  scale_fill_manual(values = c("#F1B147", "#47B1F1")) + 
  labs(
    title = "Distribution of Airconditioning status",
    fill = "Airconditioning Status"
  )

Here 68.4 percent of the houses have no air conditioning, but I do not know how this will affect the predictions.

I think that is enough exploration, and we can start with our first model.

Models 1 & 2

First, I would like to start pretty simply, with a linear model.

I will include all variables in my model, because they all seem to be very important.

Linear model

I will use the lm function in R to find the beta coefficients and create my model:

price_lm <- lm(formula = price ~ area + bedrooms + hotwaterheating + airconditioning + stories + mainroad + parking + furnishingstatus + bathrooms + guestroom + basement + prefarea, data = dt_houses)

summary(price_lm)

Call:
lm(formula = price ~ area + bedrooms + hotwaterheating + airconditioning + 
    stories + mainroad + parking + furnishingstatus + bathrooms + 
    guestroom + basement + prefarea, data = dt_houses)

Residuals:
     Min       1Q   Median       3Q      Max 
-2619718  -657322   -68409   507176  5166695 

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)                      42771.69  264313.31   0.162 0.871508    
area                               244.14      24.29  10.052  < 2e-16 ***
bedrooms                        114787.56   72598.66   1.581 0.114445    
hotwaterheatingyes              855447.15  223152.69   3.833 0.000141 ***
airconditioningyes              864958.31  108354.51   7.983 8.91e-15 ***
stories                         450848.00   64168.93   7.026 6.55e-12 ***
mainroadyes                     421272.59  142224.13   2.962 0.003193 ** 
parking                         277107.10   58525.89   4.735 2.82e-06 ***
furnishingstatussemi-furnished  -46344.62  116574.09  -0.398 0.691118    
furnishingstatusunfurnished    -411234.39  126210.56  -3.258 0.001192 ** 
bathrooms                       987668.11  103361.98   9.555  < 2e-16 ***
guestroomyes                    300525.86  131710.22   2.282 0.022901 *  
basementyes                     350106.90  110284.06   3.175 0.001587 ** 
prefareayes                     651543.80  115682.34   5.632 2.89e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1068000 on 531 degrees of freedom
Multiple R-squared:  0.6818,    Adjusted R-squared:  0.674 
F-statistic: 87.52 on 13 and 531 DF,  p-value: < 2.2e-16

We got an R-squared of 0.68, which is not that bad for a first model. I will try to do better later, but first, another model.

I would like to measure the performance of my models with the mean squared error (MSE), so I will calculate the MSE for the linear model. Note that it is computed on the same data the model was fitted on.

price_lm_mse <- mean(price_lm$residuals^2)

price_lm_mse
[1] 1.111188e+12
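
Because this MSE is measured on the training data, it can look more optimistic than the error on unseen houses. A minimal sketch of a hold-out check is shown below. It runs on simulated data rather than on dt_houses; the names toy, mse(), the 80/20 split and the seed are assumptions made only for this illustration.

```r
set.seed(42)

# Hypothetical data: price depends linearly on area plus noise
n <- 200
toy <- data.frame(area = runif(n, 1500, 16000))
toy$price <- 2e6 + 400 * toy$area + rnorm(n, sd = 1e6)

# Random 80/20 split into training and test rows
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

# Mean squared error helper
mse <- function(actual, predicted) mean((actual - predicted)^2)

fit <- lm(price ~ area, data = train)

mse_train <- mse(train$price, predict(fit, newdata = train))  # in-sample error
mse_test  <- mse(test$price,  predict(fit, newdata = test))   # out-of-sample error

print(c(train = mse_train, test = mse_test))
```

The same split could be reused for the tree model, so that both models are compared on rows that neither of them has seen during fitting.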

Tree Model

I think this model could perform better, because some variables may affect the price in a non-linear way, and in that case a tree model can show better performance.

prices_tree <- rpart(data = dt_houses, formula = price ~ area + bedrooms + hotwaterheating + airconditioning + stories + mainroad + parking + furnishingstatus + bathrooms + guestroom + basement + prefarea, method = 'anova')

prp(prices_tree, digits = -3)

printcp(prices_tree)

Regression tree:
rpart(formula = price ~ area + bedrooms + hotwaterheating + airconditioning + 
    stories + mainroad + parking + furnishingstatus + bathrooms + 
    guestroom + basement + prefarea, data = dt_houses, method = "anova")

Variables actually used in tree construction:
[1] airconditioning  area             basement         bathrooms        furnishingstatus parking         

Root node error: 1.9032e+15/545 = 3.4921e+12

n= 545 

         CP nsplit rel error  xerror     xstd
1  0.304946      0   1.00000 1.00658 0.085403
2  0.094553      1   0.69505 0.71881 0.062940
3  0.053743      2   0.60050 0.61529 0.054163
4  0.026381      3   0.54676 0.59649 0.051506
5  0.024922      4   0.52038 0.57323 0.048914
6  0.022993      5   0.49546 0.56741 0.047781
7  0.021374      6   0.47246 0.55454 0.046047
8  0.015261      7   0.45109 0.52965 0.043470
9  0.013952      8   0.43583 0.52776 0.045583
10 0.012386      9   0.42188 0.53686 0.046821
11 0.010000     10   0.40949 0.52466 0.046506

Now that I have built a tree model on my dataset with the help of rpart, let’s explore it:

prices_tree
n= 545 

node), split, n, deviance, yval
      * denotes terminal node

 1) root 545 1.903208e+15 4766729  
   2) area< 5954 361 6.066751e+14 4029993  
     4) bathrooms< 1.5 293 3.297298e+14 3773561  
       8) area< 4016 174 1.437122e+14 3431227  
        16) furnishingstatus=unfurnished 78 4.036605e+13 2977962 *
        17) furnishingstatus=furnished,semi-furnished 96 7.430067e+13 3799505 *
       9) area>=4016 119 1.358098e+14 4274118 *
     5) bathrooms>=1.5 68 1.746610e+14 5134912  
      10) airconditioning=no 44 7.024826e+13 4563682 *
      11) airconditioning=yes 24 6.373358e+13 6182167 *
   3) area>=5954 184 7.161564e+14 6212174  
     6) bathrooms< 1.5 108 2.869179e+14 5382579  
      12) airconditioning=no 65 1.170629e+14 4843569  
        24) basement=no 38 5.226335e+13 4304816 *
        25) basement=yes 27 3.824662e+13 5601815 *
      13) airconditioning=yes 43 1.224240e+14 6197360 *
     7) bathrooms>=1.5 76 2.492851e+14 7391072  
      14) parking< 1.5 51 7.184700e+13 6859794 *
      15) parking>=1.5 25 1.336772e+14 8474878  
        30) airconditioning=no 10 5.146311e+13 7285600 *
        31) airconditioning=yes 15 5.864106e+13 9267729 *

Now it would be great to prune the tree, because I do not want my tree to overfit:

plotcp(prices_tree)

prices_tree_min_cp <- prices_tree$cptable[which.min(prices_tree$cptable[, "xerror"]), "CP"]
model_tree <- prune(prices_tree, cp = prices_tree_min_cp )
prp(model_tree, digits = -3)  # plot the pruned tree, not the original one
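
To make the choice of cp more transparent, here is a minimal sketch that works directly on the numbers printed by printcp() above (copied by hand into a data frame, so it does not need rpart). It finds the row with the lowest cross-validated error, and, as an alternative that I did not use in this coursework, the commonly used one-standard-error rule: the simplest tree whose xerror is within one xstd of the minimum. The xerror values come from random cross-validation folds, so they can differ from run to run.

```r
# CP table copied from the printcp() output above
cp_table <- data.frame(
  CP     = c(0.304946, 0.094553, 0.053743, 0.026381, 0.024922, 0.022993,
             0.021374, 0.015261, 0.013952, 0.012386, 0.010000),
  nsplit = 0:10,
  xerror = c(1.00658, 0.71881, 0.61529, 0.59649, 0.57323, 0.56741,
             0.55454, 0.52965, 0.52776, 0.53686, 0.52466),
  xstd   = c(0.085403, 0.062940, 0.054163, 0.051506, 0.048914, 0.047781,
             0.046047, 0.043470, 0.045583, 0.046821, 0.046506)
)

# Row with the lowest cross-validated error (the rule used in this coursework)
min_row <- which.min(cp_table$xerror)

# One-standard-error rule: first (simplest) tree within one xstd of the minimum
threshold  <- cp_table$xerror[min_row] + cp_table$xstd[min_row]
one_se_row <- which(cp_table$xerror <= threshold)[1]

print(cp_table[min_row, ])
print(cp_table[one_se_row, ])
```

With these numbers the minimum is in the last row (cp = 0.01, 10 splits), so pruning at the minimum keeps the full tree, while the one-standard-error rule would point to the smaller tree with 5 splits.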

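After pruning the tree, let’s calculate the MSE for the tree model. Note that the lowest xerror in the table above belongs to the last row (cp = 0.01), so pruning at this cp leaves the tree unchanged.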

prices_tree_pred <- predict(model_tree, dt_houses[, c("area","bathrooms", "bedrooms", "hotwaterheating", "airconditioning", "parking", "stories", "mainroad", "furnishingstatus", "guestroom", "basement", "prefarea")])  # predict with the pruned tree
prices_tree_mse <- mean((dt_houses$price - prices_tree_pred)^2)

prices_tree_mse
[1] 1.429988e+12

Comparing two models

The linear model for price has an MSE of

price_lm_mse
[1] 1.111188e+12

The tree model for price has an MSE of

prices_tree_mse
[1] 1.429988e+12

It is surprising to me, as a person who does not have a lot of experience in modelling, that the linear model performs better than the tree model by approx. 22% (in terms of MSE on the training data).

100 - price_lm_mse / prices_tree_mse * 100
[1] 22.29392

Feature Engineering

Feature 1

Calculating the overall number of rooms

Here I would like to try all the ideas and observations I have had during my coursework. There are two columns, “bedrooms” and “bathrooms”, which store the number of rooms of each kind. It makes sense to me to create a new column “room_count”, because it may have a bigger impact on the performance.

Linear Model

dt_houses[, 'room_count' := bathrooms + bedrooms]

Let’s try the model with the new variable:

price_lm_2 <- lm(formula = price ~ area + bedrooms + hotwaterheating + airconditioning + stories + mainroad + parking + furnishingstatus + bathrooms + guestroom + basement + prefarea + room_count, data = dt_houses)

summary(price_lm_2)

Call:
lm(formula = price ~ area + bedrooms + hotwaterheating + airconditioning + 
    stories + mainroad + parking + furnishingstatus + bathrooms + 
    guestroom + basement + prefarea + room_count, data = dt_houses)

Residuals:
     Min       1Q   Median       3Q      Max 
-2619718  -657322   -68409   507176  5166695 

Coefficients: (1 not defined because of singularities)
                                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)                      42771.69  264313.31   0.162 0.871508    
area                               244.14      24.29  10.052  < 2e-16 ***
bedrooms                        114787.56   72598.66   1.581 0.114445    
hotwaterheatingyes              855447.15  223152.69   3.833 0.000141 ***
airconditioningyes              864958.31  108354.51   7.983 8.91e-15 ***
stories                         450848.00   64168.93   7.026 6.55e-12 ***
mainroadyes                     421272.59  142224.13   2.962 0.003193 ** 
parking                         277107.10   58525.89   4.735 2.82e-06 ***
furnishingstatussemi-furnished  -46344.62  116574.09  -0.398 0.691118    
furnishingstatusunfurnished    -411234.39  126210.56  -3.258 0.001192 ** 
bathrooms                       987668.11  103361.98   9.555  < 2e-16 ***
guestroomyes                    300525.86  131710.22   2.282 0.022901 *  
basementyes                     350106.90  110284.06   3.175 0.001587 ** 
prefareayes                     651543.80  115682.34   5.632 2.89e-08 ***
room_count                             NA         NA      NA       NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1068000 on 531 degrees of freedom
Multiple R-squared:  0.6818,    Adjusted R-squared:  0.674 
F-statistic: 87.52 on 13 and 531 DF,  p-value: < 2.2e-16
mean(price_lm_2$residuals^2)
[1] 1.111188e+12

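The result is exactly the same. The coefficient of room_count is NA because room_count is the exact sum of bedrooms and bathrooms, which are already in the model. lm() cannot estimate a separate coefficient for a perfectly collinear variable and drops it, so this feature does not make the linear model any better.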
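
The "not defined because of singularities" message can be reproduced on a small scale. The following is a minimal sketch on simulated data (toy, its coefficients and the seed are made up for illustration): when a predictor is an exact sum of two other predictors, lm() reports NA for it and the fitted values do not change.

```r
set.seed(1)

# Hypothetical houses with random room counts
toy <- data.frame(
  bedrooms  = sample(1:5, 50, replace = TRUE),
  bathrooms = sample(1:3, 50, replace = TRUE)
)
toy$price <- 100 + 20 * toy$bedrooms + 50 * toy$bathrooms + rnorm(50)

# Engineered feature that is an exact linear combination of existing columns
toy$room_count <- toy$bedrooms + toy$bathrooms

fit_without <- lm(price ~ bedrooms + bathrooms, data = toy)
fit_with    <- lm(price ~ bedrooms + bathrooms + room_count, data = toy)

# room_count gets an NA coefficient; the other coefficients are unchanged
print(coef(fit_with))

# Both models produce identical fitted values
print(all.equal(unname(fitted(fit_with)), unname(fitted(fit_without))))
```

A sum like this can still be useful if it replaces the original columns instead of being added next to them, or in a tree model, where the split on room_count is not equivalent to splits on the two original variables.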

Tree Model

Comparing

Feature 2

Moving area closer to Gaussian (log transformation)

What if we try to bring the area variable closer to Gaussian with a log transformation? Because the area density is skewed to the right, a log transformation can help us to normalise the variable.

Linear Model

dt_houses[, area_log := log(area)]

A little visualisation:

ggplot(data = dt_houses, aes(x = area_log)) + 
  geom_density(fill="#f1b147", color="#f1b147", alpha=0.25) + 
  labs(
    x = 'log(Area)',
    y = 'Density'
  ) +
  theme_minimal() + 
  theme(axis.line = element_line(color = "#000000"))

And try the model again :)

price_lm_2 <- lm(formula = price ~ area + bedrooms + hotwaterheating + airconditioning + stories + mainroad + parking + furnishingstatus + bathrooms + guestroom + basement + prefarea + room_count + area_log, data = dt_houses)

summary(price_lm_2)

Call:
lm(formula = price ~ area + bedrooms + hotwaterheating + airconditioning + 
    stories + mainroad + parking + furnishingstatus + bathrooms + 
    guestroom + basement + prefarea + room_count + area_log, 
    data = dt_houses)

Residuals:
     Min       1Q   Median       3Q      Max 
-2607115  -665756   -73006   497325  5120891 

Coefficients: (1 not defined because of singularities)
                                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)                    -8.716e+06  3.455e+06  -2.523 0.011936 *  
area                            4.404e+01  8.233e+01   0.535 0.592912    
bedrooms                        1.175e+05  7.224e+04   1.627 0.104283    
hotwaterheatingyes              8.585e+05  2.220e+05   3.867 0.000124 ***
airconditioningyes              8.214e+05  1.092e+05   7.525 2.28e-13 ***
stories                         4.475e+05  6.386e+04   7.007 7.41e-12 ***
mainroadyes                     3.471e+05  1.445e+05   2.403 0.016608 *  
parking                         2.689e+05  5.832e+04   4.612 5.01e-06 ***
furnishingstatussemi-furnished -7.058e+04  1.164e+05  -0.607 0.544418    
furnishingstatusunfurnished    -4.288e+05  1.258e+05  -3.410 0.000699 ***
bathrooms                       9.814e+05  1.029e+05   9.540  < 2e-16 ***
guestroomyes                    2.419e+05  1.331e+05   1.818 0.069629 .  
basementyes                     3.678e+05  1.099e+05   3.345 0.000880 ***
prefareayes                     6.727e+05  1.154e+05   5.830 9.66e-09 ***
room_count                             NA         NA      NA       NA    
area_log                        1.169e+06  4.596e+05   2.542 0.011290 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1062000 on 530 degrees of freedom
Multiple R-squared:  0.6856,    Adjusted R-squared:  0.6773 
F-statistic: 82.57 on 14 and 530 DF,  p-value: < 2.2e-16
mean(price_lm_2$residuals^2)
[1] 1.097798e+12

It performs slightly better: R-squared rose from 0.6818 to 0.6856 (about 0.4 percentage points), and the MSE fell from 1.111188e+12 to 1.097798e+12, a decrease of about 0.013e+12 (roughly 1.2%).
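
The size of the improvement can be worked out directly from the two MSE values and the two R-squared values printed above. This is only the arithmetic, with the numbers copied by hand from the model outputs:

```r
# Values copied from the outputs above
mse_baseline <- 1.111188e12   # linear model without area_log
mse_log_area <- 1.097798e12   # linear model with area_log
r2_baseline  <- 0.6818
r2_log_area  <- 0.6856

abs_gain       <- mse_baseline - mse_log_area       # absolute MSE reduction
rel_gain_pct   <- 100 * abs_gain / mse_baseline     # relative MSE reduction in percent
r2_gain_points <- 100 * (r2_log_area - r2_baseline) # gain in percentage points

print(abs_gain)
print(round(rel_gain_pct, 2))
print(round(r2_gain_points, 2))
```

So the MSE drops by about 1.34e+10, which is around 1.2% of the baseline MSE, while R-squared gains about 0.38 percentage points.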

Tree Model

Comparing

I think it could be a good idea to take a look at the correlation between variables, but from the data exploration I can already say that area correlates with price.

Here we are, the correlation plot:

ggcorrplot(corr = cor(dt_houses[, .(price, area, bedrooms, bathrooms, stories, parking)]), 
           hc.order = TRUE,
           lab = TRUE)

Hm, the correlation plot does not look as great as I expected, but the strongest correlations with price are area and the number of bathrooms.

Feature 3

Treat bathrooms as a factor variable

I got an idea: we have bathrooms, and they range from 1 to 4. What if we treat each number of bathrooms as a separate factor level?

Linear Model

# creating factor
dt_houses[, count_bathrooms_1 := 0][bathrooms == 1, count_bathrooms_1 := 1]
dt_houses[, count_bathrooms_2 := 0][bathrooms == 2, count_bathrooms_2 := 1]
dt_houses[, count_bathrooms_3 := 0][bathrooms == 3, count_bathrooms_3 := 1]
dt_houses[, count_bathrooms_4 := 0][bathrooms == 4, count_bathrooms_4 := 1]


price_lm_2 <- lm(formula = price ~ area + bedrooms + hotwaterheating + airconditioning + stories + mainroad + parking + furnishingstatus + guestroom + basement + prefarea + count_bathrooms_1 + count_bathrooms_2 + count_bathrooms_3 + count_bathrooms_4 + room_count + area_log, data = dt_houses)

summary(price_lm_2)

Call:
lm(formula = price ~ area + bedrooms + hotwaterheating + airconditioning + 
    stories + mainroad + parking + furnishingstatus + guestroom + 
    basement + prefarea + count_bathrooms_1 + count_bathrooms_2 + 
    count_bathrooms_3 + count_bathrooms_4 + room_count + area_log, 
    data = dt_houses)

Residuals:
     Min       1Q   Median       3Q      Max 
-2621190  -644381   -71750   495480  5189707 

Coefficients: (2 not defined because of singularities)
                                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)                    -3.592e+06  3.622e+06  -0.992 0.321720    
area                            2.858e+01  8.292e+01   0.345 0.730510    
bedrooms                        1.222e+05  7.219e+04   1.693 0.091082 .  
hotwaterheatingyes              8.739e+05  2.218e+05   3.941 9.21e-05 ***
airconditioningyes              8.293e+05  1.094e+05   7.583 1.53e-13 ***
stories                         4.479e+05  6.402e+04   6.995 8.05e-12 ***
mainroadyes                     3.481e+05  1.442e+05   2.414 0.016117 *  
parking                         2.607e+05  5.838e+04   4.465 9.78e-06 ***
furnishingstatussemi-furnished -7.002e+04  1.167e+05  -0.600 0.548761    
furnishingstatusunfurnished    -4.336e+05  1.261e+05  -3.439 0.000629 ***
guestroomyes                    2.461e+05  1.329e+05   1.851 0.064694 .  
basementyes                     3.747e+05  1.098e+05   3.413 0.000692 ***
prefareayes                     6.856e+05  1.154e+05   5.942 5.13e-09 ***
count_bathrooms_1              -4.727e+06  1.083e+06  -4.364 1.53e-05 ***
count_bathrooms_2              -3.838e+06  1.080e+06  -3.554 0.000413 ***
count_bathrooms_3              -2.573e+06  1.128e+06  -2.281 0.022944 *  
count_bathrooms_4                      NA         NA      NA       NA    
room_count                             NA         NA      NA       NA    
area_log                        1.247e+06  4.623e+05   2.697 0.007215 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1061000 on 528 degrees of freedom
Multiple R-squared:  0.688, Adjusted R-squared:  0.6785 
F-statistic: 72.75 on 16 and 528 DF,  p-value: < 2.2e-16
mean(price_lm_2$residuals^2)
[1] 1.089709e+12
1.097798e+12 - 1.089709e+12
[1] 8.089e+09

Yes! The model now has an MSE that is 0.008089e+12 lower. A slight improvement, but still great.
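
The NA for count_bathrooms_4 has a simple reason: the four indicator columns always add up to 1, which is exactly the intercept column, so one of them is redundant. A minimal sketch on simulated data (toy and its values are hypothetical) shows this, and also shows that wrapping the variable in factor() lets lm() build the indicators itself and leave one level out as the reference.

```r
set.seed(7)

# Hypothetical houses with 1 to 4 bathrooms
toy <- data.frame(bathrooms = rep(1:4, times = c(20, 12, 6, 2)))
toy$price <- 3e6 + 8e5 * toy$bathrooms + rnorm(nrow(toy), sd = 2e5)

# Manual 0/1 indicator columns, one per number of bathrooms
for (k in 1:4) {
  toy[[paste0("count_bathrooms_", k)]] <- as.integer(toy$bathrooms == k)
}

fit_manual <- lm(price ~ count_bathrooms_1 + count_bathrooms_2 +
                   count_bathrooms_3 + count_bathrooms_4, data = toy)

# factor() creates the indicators automatically and drops the first level
fit_factor <- lm(price ~ factor(bathrooms), data = toy)

print(coef(fit_manual))   # count_bathrooms_4 is NA
print(coef(fit_factor))   # intercept plus three level effects
```

Both versions give the same fitted values; they only differ in which number of bathrooms serves as the reference level, which is why the manual coefficients above are negative (relative to 4 bathrooms).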

Tree Model

Comparing

Feature 4

Airconditioning as one-hot encoding

This could matter, because if this dataset was gathered in a hot area, where summers are usually very warm, air conditioning could be a very important factor when buying a house: having it makes a place more attractive, while not having it could make a place worse.

For example, if there is air conditioning the coefficient could be Beta = x, but if there is none, the contribution is not 0 but a deduction of y from the price of the property.

Let’s do this:

# creating factors
dt_houses[, airconditioning_yes := 0][airconditioning == 'yes', airconditioning_yes := 1]  # 1 if the house has air conditioning, else 0
dt_houses[, airconditioning_no := 0][airconditioning == 'no', airconditioning_no := 1]     # 1 if the house has no air conditioning, else 0

# calculating and running model
price_lm_2 <- lm(formula = price ~ + area + bedrooms +  airconditioning + hotwaterheating + stories + mainroad + parking + furnishingstatus + guestroom + basement + prefarea + count_bathrooms_1 + count_bathrooms_2 + count_bathrooms_3 + count_bathrooms_4 + room_count + area_log + airconditioning_yes + airconditioning_no, data = dt_houses)

summary(price_lm_2)

Call:
lm(formula = price ~ +area + bedrooms + airconditioning + hotwaterheating + 
    stories + mainroad + parking + furnishingstatus + guestroom + 
    basement + prefarea + count_bathrooms_1 + count_bathrooms_2 + 
    count_bathrooms_3 + count_bathrooms_4 + room_count + area_log + 
    airconditioning_yes + airconditioning_no, data = dt_houses)

Residuals:
     Min       1Q   Median       3Q      Max 
-2621190  -644381   -71750   495480  5189707 

Coefficients: (4 not defined because of singularities)
                                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)                    -3.592e+06  3.622e+06  -0.992 0.321720    
area                            2.858e+01  8.292e+01   0.345 0.730510    
bedrooms                        1.222e+05  7.219e+04   1.693 0.091082 .  
airconditioningyes              8.293e+05  1.094e+05   7.583 1.53e-13 ***
hotwaterheatingyes              8.739e+05  2.218e+05   3.941 9.21e-05 ***
stories                         4.479e+05  6.402e+04   6.995 8.05e-12 ***
mainroadyes                     3.481e+05  1.442e+05   2.414 0.016117 *  
parking                         2.607e+05  5.838e+04   4.465 9.78e-06 ***
furnishingstatussemi-furnished -7.002e+04  1.167e+05  -0.600 0.548761    
furnishingstatusunfurnished    -4.336e+05  1.261e+05  -3.439 0.000629 ***
guestroomyes                    2.461e+05  1.329e+05   1.851 0.064694 .  
basementyes                     3.747e+05  1.098e+05   3.413 0.000692 ***
prefareayes                     6.856e+05  1.154e+05   5.942 5.13e-09 ***
count_bathrooms_1              -4.727e+06  1.083e+06  -4.364 1.53e-05 ***
count_bathrooms_2              -3.838e+06  1.080e+06  -3.554 0.000413 ***
count_bathrooms_3              -2.573e+06  1.128e+06  -2.281 0.022944 *  
count_bathrooms_4                      NA         NA      NA       NA    
room_count                             NA         NA      NA       NA    
area_log                        1.247e+06  4.623e+05   2.697 0.007215 ** 
airconditioning_yes                    NA         NA      NA       NA    
airconditioning_no                     NA         NA      NA       NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1061000 on 528 degrees of freedom
Multiple R-squared:  0.688, Adjusted R-squared:  0.6785 
F-statistic: 72.75 on 16 and 528 DF,  p-value: < 2.2e-16
mean(price_lm_2$residuals^2)
[1] 1.089709e+12
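
Both new columns get NA coefficients and the MSE does not change, because airconditioning_yes is identical to the indicator lm() already builds for airconditioning, and airconditioning_no is just 1 minus it. The idea of "plus x with air conditioning, minus y without" is also already covered by a model with an intercept. A minimal sketch on simulated data (toy, the seed and the effect sizes are hypothetical) compares a 0/1 coding with a +1/-1 coding:

```r
set.seed(3)

# Hypothetical houses with and without air conditioning
toy <- data.frame(airconditioning = sample(c("yes", "no"), 60, replace = TRUE))
toy$price <- 4e6 + 8e5 * (toy$airconditioning == "yes") + rnorm(60, sd = 3e5)

# Coding 1: yes = 1, no = 0 (what lm() does for a two-level variable)
toy$ac_01 <- as.integer(toy$airconditioning == "yes")

# Coding 2: yes = +1, no = -1 (a bonus with AC, a penalty without)
toy$ac_pm <- ifelse(toy$airconditioning == "yes", 1, -1)

fit_01 <- lm(price ~ ac_01, data = toy)
fit_pm <- lm(price ~ ac_pm, data = toy)

print(coef(fit_01))
print(coef(fit_pm))

# The predictions are the same; only the intercept and the slope are rescaled
print(all.equal(unname(fitted(fit_01)), unname(fitted(fit_pm))))
```

The slope under the +1/-1 coding is half of the 0/1 slope and the intercept shifts accordingly, so for a two-level variable a different encoding alone cannot improve the linear model.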

aGVtZShheGlzLmxpbmUgPSBlbGVtZW50X2xpbmUoY29sb3IgPSAiIzAwMDAwMCIpKQpgYGAKClRoaXMgbG9va3MgbmljZSwgYW5kIGl0IGlzIGFsc28gbG9naWNhbCwgbW9yZSBzcGFjZSwgaGlnaGVyIHByaWNlLiBCdXQgaWYgd2UgdGFrZSBhIGxvb2sgYXQgcGFya2luZyBwbGFjZXMsIHRoZXJlIGlzIGhhcmQgdG8gc2VlIGEgdHJlbmQuCgpCdXQsIG5vdyBJIGhhdmUgdGhlIHNpbXBsZXN0IGlkZWEsIGhvdyBkb2VzIGFtb3VudCBvZiBiZWRyb29tcyBjb3JyZWxhdGVzIHdpdGggdGhlIHByaWNlLgoKYGBge3J9CmdncGxvdChkYXRhID0gZHRfaG91c2VzLCBhZXMoeCA9IGZhY3RvcihiZWRyb29tcyksIHkgPSBwcmljZSkpICsKICBnZW9tX2JveHBsb3QoKSArIAogIHRoZW1lX21pbmltYWwoKSAKYGBgCgpXZSBjYW4gc2VlLCB0aGF0IG9uIGF2ZXJhZ2UsIG1vcmUgYmVkcm9vbXMsIG1lYW5zIGhpZ2hlciBwcmljZSwgYnV0IEkgdGhpbmsgdGhlcmUgaXMgbm90IHJlYWxseSBzdHJvbmcgcmVsYXRpb25zaGlwIGJldHdlZW4gdGhpcyB0d28gdmFyaWFibGVzLgoKQWxzbyBpdCB3b3VsZCBiZSBncmVhdCB0byB0YWtlIGEgbG9vayBhdCBhIGJlZHJvb21zIGhpc3RvZ3JhbToKCmBgYHtyfQpnZ3Bsb3QoZGF0YSA9IGR0X2hvdXNlcywgYWVzKHggPSBiZWRyb29tcykpICsgCiAgZ2VvbV9oaXN0b2dyYW0oZmlsbD0iIzJmOWU0NCIsIGNvbG9yPSIjMmY5ZTQ0IiwgYWxwaGE9MC4yNSkgKyAKICBnZW9tX3ZsaW5lKHhpbnRlcmNlcHQgPSBtZWFuKGR0X2hvdXNlcyRiZWRyb29tcyksIGxpbmV0eXBlPSJkYXNoZWQiKSArIAogIHRoZW1lX21pbmltYWwoKSArIAogIHRoZW1lKGF4aXMubGluZSA9IGVsZW1lbnRfbGluZShjb2xvciA9ICIjMDAwMDAwIikpCmBgYAptZWFuIG9mIHRoZSBiZWRyb29tczoKYGBge3J9Cm1lYW4oZHRfaG91c2VzJGJlZHJvb21zKQpgYGAKCgpIZXJlIHdlIGNhbiBzZWUsIHRoYXQgdGhlIG1vc3Qgb2YgdGhlIHByb3BlcnRpZXMgdGVuZCB0byBoYXZlIDIsIDMgb3IgNCByb29tcy4gCgpMZXQncyBoYXZlIGEgbG9vayBhdCBoaXN0b2dyYW0gb2Ygc3RvcmllczogCgpgYGB7cn0KZ2dwbG90KGRhdGEgPSBkdF9ob3VzZXMsIGFlcyh4ID0gc3RvcmllcykpICsgCiAgZ2VvbV9oaXN0b2dyYW0oZmlsbD0iIzJmOWU0NCIsIGNvbG9yPSIjMmY5ZTQ0IiwgYWxwaGE9MC4yNSkgKyAKICBnZW9tX3ZsaW5lKHhpbnRlcmNlcHQgPSBtZWFuKGR0X2hvdXNlcyRzdG9yaWVzKSwgbGluZXR5cGU9ImRhc2hlZCIpICsgCiAgdGhlbWVfbWluaW1hbCgpICsgCiAgdGhlbWUoYXhpcy5saW5lID0gZWxlbWVudF9saW5lKGNvbG9yID0gIiMwMDAwMDAiKSkKYGBgCgpgYGB7cn0KbWVhbihkdF9ob3VzZXMkc3RvcmllcykKYGBgCgp3ZSBjYW4gc2VlLCB0aGF0IG1vc3Qgb2YgdGhlIGhvdXNlcyBhcmUgMS0yIHN0b3JpZXMuCgpCYXRocm9vbXMgYXJlIGFsc28gaW50ZXJlc3RpbmcgdmFyaWFibGUsIHNvIGxldCdzIHRha2UgYSBsb29rIGF0IGhpc3RvZ3Jh
bSBhbmQgYSBCb3hwbG90IGJhdGhyb29tcyBhbmQgcHJpY2U6CmBgYHtyfQpnZ3Bsb3QoZGF0YSA9IGR0X2hvdXNlcywgYWVzKHggPSBiYXRocm9vbXMpKSArIAogIGdlb21faGlzdG9ncmFtKGZpbGw9IiMyZjllNDQiLCBjb2xvcj0iIzJmOWU0NCIsIGFscGhhPTAuMjUpICsgCiAgZ2VvbV92bGluZSh4aW50ZXJjZXB0ID0gbWVhbihkdF9ob3VzZXMkYmF0aHJvb21zKSwgbGluZXR5cGU9ImRhc2hlZCIpICsgCiAgdGhlbWVfbWluaW1hbCgpICsgCiAgdGhlbWUoYXhpcy5saW5lID0gZWxlbWVudF9saW5lKGNvbG9yID0gIiMwMDAwMDAiKSkKYGBgCgoKYGBge3J9CmdncGxvdChkYXRhID0gZHRfaG91c2VzLCBhZXMoeCA9IGZhY3RvcihiYXRocm9vbXMpLCB5ID0gcHJpY2UpKSArCiAgZ2VvbV9ib3hwbG90KCkgKyAKICB0aGVtZV9taW5pbWFsKCkgCmBgYAoKaGVyZSBpdCBpcyBhbHNvIGFsbW9zdCBvYnZpb3VzLCB0aGF0LCBpZiB3ZSBoYXZlIG1vcmUgYmF0aHJvb21zLCBwcmljZSB3aWxsIGJlIGFsc28gdXAuIE9ubHkgb25lIGRpc2FkdmFudGFnZSwgdGhhdCBpbiBteSBkYXRhc2V0IEkgZG8gbm90IGhhdmUgZW5vdWdoIGRhdGEgYWJvdXQgcHJvcGVydGllcyB3aXRoIDMgb3IgNCBiYXRocm9vbXMsIEkgaGF2ZSBzb21lIG9uIDMsIGJ1dCByZWFsbHkgbHVjayBvbiA0LgoKRnVybmlzaGluZyBpcyBhbHNvIGltcG9ydGFudCwgbWFueSBwZW9wbGUgc2VhcmNoIGZvciBhcGFydG1lbnRzIHdpdGggZnVybml0dXJlLCBidXQgZnVybml0dXJlIGNvdWxkIGJlIG5vdCBpbiBhIGJlc3Qgc2hhcGUgb3IgYnV5ZXIgbWF5IGRvIG5vdCBsaWtlIHRoZSBzdHlsZS4gU28gZnJvbSBteSBvcGluaW9uLCBpdCBpcyBub3QgYXMgc3Ryb25nKGluIHByZWRpY3Rpb24pLCBhcyBmb3IgZXhhbXBsZSBhcmVhLgoKSG93IG11Y2ggcmVhbCBlc3RhdGUgZnVybmlzaGVkIG9yIG5vdDoKCmBgYHtyfQpnZ3Bsb3QoZGF0YSA9IGR0X2hvdXNlcywgYWVzKHggPSBmYWN0b3IoZnVybmlzaGluZ3N0YXR1cyksIGZpbGwgPSBmYWN0b3IoZnVybmlzaGluZ3N0YXR1cykpKSArIAogIGdlb21fYmFyKGNvbG9yPSIjY2VkNGRhIiwgYWxwaGE9MC4yNSkgKyAKICBzY2FsZV9maWxsX3ZpcmlkaXNfZChvcHRpb24gPSAiRCIpICsgCiAgbGFicyh0aXRsZSA9ICJCYXIgQ2hhcnQgd2l0aCBEaWZmZXJlbnQgQ29sb3JzIiwgCiAgICAgICB4ID0gIkZ1cm5pc2hpbmcgU3RhdHVzIiwgCiAgICAgICB5ID0gIkNvdW50IikgKyAKICB0aGVtZV9taW5pbWFsKCkgKyAKICB0aGVtZShheGlzLmxpbmUgPSBlbGVtZW50X2xpbmUoY29sb3IgPSAiIzAwMDAwMCIpKQpgYGAKCldlIGNhbiBzZWUsIHRoYXQgbW9zdCBvZiB0aGUgaG91c2VzIGFyZSBzZW1pLWZ1cm5pc2hlZC4gd2hpY2ggaXMgYWxzbyBsb2dpY2FsLCBiZWNhdXNlIHdoZW4gd2Ugc2VsbCBhIGhvdXNlIG9yIGFwYXJ0bWVudCwgcHJvYmFibHkgd2Ugd291bGQgdGFrZSBpbiBtb3N0IG9mIHRoZSBjYXNlcyB0aGUgbW9zdCB2YWx1YWJsZSB0
aGluZ3MgZm9yIHVzIGFuZCBmdXJuaXR1cmUgaW5jbHVkZWQuCgpOb3csIGl0IHdvdWxkIGJlIGdyZWF0LCB0byBsb29rIGF0IHByaWNlIGFuZCBhcmVhIGRpc3RyaWJ1dGlvbiBpbiBkaWZmZXJlbnRseSBmdXJuaXNoZWQgcHJvcGVydGllcwoKCmBgYHtyfQpnZ3Bsb3QoZGF0YSA9IGR0X2hvdXNlcywgYWVzKHkgPSBwcmljZSwgeCA9IGFyZWEpKSArIAogIGdlb21fcG9pbnQoZGF0YSA9IGR0X2hvdXNlcywgYWVzKHkgPSBwcmljZSwgeCA9IGFyZWEsIGNvbG9yID0gYmVkcm9vbXMpKSArCiAgZ2VvbV9obGluZSh5aW50ZXJjZXB0ID0gbWVhbihkdF9ob3VzZXMkcHJpY2UpLCBsaW5ldHlwZT0nZGFzaGVkJykgKyAKICBmYWNldF9ncmlkKC5+ZnVybmlzaGluZ3N0YXR1cykgKwogIHNjYWxlX3lfY29udGludW91cyhsYWJlbHMgPSBsYWJlbF9udW1iZXIoc2NhbGUgPSAxZS02LCBzdWZmaXggPSAiTSIpKSArCiAgc2NhbGVfY29sb3JfZGlzdGlsbGVyKHR5cGUgPSAic2VxIiwgcGFsZXR0ZSA9ICJHcmVlbnMiKSArCiAgdGhlbWVfbWluaW1hbCgpICsgCiAgdGhlbWUoYXhpcy5saW5lID0gZWxlbWVudF9saW5lKGNvbG9yID0gIiMwMDAwMDAiKSkKYGBgCgpBbHNvLCBvbiBhdmVyYWdlLCB5b3UgY2FuIG5vdGljZSwgdGhhdCB1bmZ1cm5pc2hlZCBob3VzZXMsIGFyZSBsZXNzIGV4cGVuc2l2ZS4KCldlIGNhbiBhbHNvIHRha2UgYSBsb29rIG9uIHNvbWUgcGllIGNoYXJ0czoKCmBgYHtyfQoKZHRfbWFpbnJvYWRfY291bnRzIDwtIGFzLmRhdGEuZnJhbWUodGFibGUoZHRfaG91c2VzJG1haW5yb2FkKSkgI3RhYmxlKCkgLSBjcmVhdGVzIGZyZXF1ZW5jeSB0YWJsZQpjb2xuYW1lcyhkdF9tYWlucm9hZF9jb3VudHMpIDwtIGMoIm1haW5yb2FkX3N0YXR1cyIsICJjb3VudCIpCmR0X21haW5yb2FkX2NvdW50cyRwZXJjZW50YWdlIDwtIHJvdW5kKGR0X21haW5yb2FkX2NvdW50cyRjb3VudCAvIHN1bShkdF9tYWlucm9hZF9jb3VudHMkY291bnQpICogMTAwLCAxKQoKZ2dwbG90KGRhdGEgPSBkdF9tYWlucm9hZF9jb3VudHMsIGFlcyh4ID0gIiIsIHkgPSBjb3VudCwgZmlsbCA9IG1haW5yb2FkX3N0YXR1cykpICsKICBnZW9tX2JhcihzdGF0ID0gImlkZW50aXR5Iiwgd2lkdGggPSAxLCBjb2xvciA9ICJ3aGl0ZSIpICsKICBjb29yZF9wb2xhcigieSIsIHN0YXJ0ID0gMCkgKwogIGdlb21fdGV4dChhZXMobGFiZWwgPSBwYXN0ZTAocGVyY2VudGFnZSwgIiUiKSksIAogICAgICAgICAgICBwb3NpdGlvbiA9IHBvc2l0aW9uX3N0YWNrKHZqdXN0ID0gMC41KSwgY29sb3IgPSAid2hpdGUiLCBzaXplID0gNCkgKyAgCiAgdGhlbWVfdm9pZCgpICsgIAogIHNjYWxlX2ZpbGxfbWFudWFsKHZhbHVlcyA9IGMoIiNGMUIxNDciLCAiIzQ3QjFGMSIpKSArIAogIGxhYnMoCiAgICB0aXRsZSA9ICJEaXN0cmlidXRpb24gb2YgTWFpbnJvYWQgU3RhdHVzIiwKICAgIGZpbGwgPSAiTWFpbnJvYWQgU3RhdHVzIgogICkKCmBgYAoKQWxtb3N0IDg2IHBlcmNlbnQgb2YgaG91
c2VzIGhhdmUgbWFpbiByb2FkLCBzbyBtYXliZSB0aGlzIHdvbid0IGJlIGEgc3Ryb25nIHByZWRpY3RvciB2YXJpYWJsZS4KCgpgYGB7cn0KCmR0X2FpcmNvbmRpdGlvbmluZ19jb3VudHMgPC0gYXMuZGF0YS5mcmFtZSh0YWJsZShkdF9ob3VzZXMkYWlyY29uZGl0aW9uaW5nKSkgI3RhYmxlKCkgLSBjcmVhdGVzIGZyZXF1ZW5jeSB0YWJsZQpjb2xuYW1lcyhkdF9haXJjb25kaXRpb25pbmdfY291bnRzKSA8LSBjKCJhaXJjb25kaXRpb25pbmdfc3RhdHVzIiwgImNvdW50IikKZHRfYWlyY29uZGl0aW9uaW5nX2NvdW50cyRwZXJjZW50YWdlIDwtIHJvdW5kKGR0X2FpcmNvbmRpdGlvbmluZ19jb3VudHMkY291bnQgLyBzdW0oZHRfYWlyY29uZGl0aW9uaW5nX2NvdW50cyRjb3VudCkgKiAxMDAsIDEpCgpnZ3Bsb3QoZGF0YSA9IGR0X2FpcmNvbmRpdGlvbmluZ19jb3VudHMsIGFlcyh4ID0gIiIsIHkgPSBjb3VudCwgZmlsbCA9IGFpcmNvbmRpdGlvbmluZ19zdGF0dXMpKSArCiAgZ2VvbV9iYXIoc3RhdCA9ICJpZGVudGl0eSIsIHdpZHRoID0gMSwgY29sb3IgPSAid2hpdGUiKSArCiAgY29vcmRfcG9sYXIoInkiLCBzdGFydCA9IDApICsKICBnZW9tX3RleHQoYWVzKGxhYmVsID0gcGFzdGUwKHBlcmNlbnRhZ2UsICIlIikpLCAKICAgICAgICAgICAgcG9zaXRpb24gPSBwb3NpdGlvbl9zdGFjayh2anVzdCA9IDAuNSksIGNvbG9yID0gIndoaXRlIiwgc2l6ZSA9IDQpICsgIAogIHRoZW1lX3ZvaWQoKSArICAKICBzY2FsZV9maWxsX21hbnVhbCh2YWx1ZXMgPSBjKCIjRjFCMTQ3IiwgIiM0N0IxRjEiKSkgKyAKICBsYWJzKAogICAgdGl0bGUgPSAiRGlzdHJpYnV0aW9uIG9mIEFpcmNvbmRpdGlvbmluZyBzdGF0dXMiLAogICAgZmlsbCA9ICJBaXJjb25kaXRpb25pbmcgU3RhdHVzIgogICkKCmBgYAoKSGVyZSA2OC40IHBlcmNlbnQgaGFzIGFpcmNvbmRpdGlvbmluZywgYnV0IEkgZG8gbm90IGtub3csIGhvdyBpdCB3aWxsIGFmZmVjdCBwcmVkaWN0aW9ucy4KCgpJIHRoaW5rIHRoYXQgd291bGQgYmUgZW5vdWdoIGV4cGxvcmF0aW9uIGFuZCB3ZSBjYW4gc3RhcnQgd2l0aCBvdXIgZmlyc3QgbW9kZWwuCgojIE1vZGVscyAxICYgMgoKRmlyc3QsIEkgd291bGQgbGlrZSB0byBzdGFydCBwcmV0dHkgc2ltcGxlIHdpdGggbGluZWFyIG1vZGVsLgoKSSBjb25zaWRlciB0byB0YWtlIGFsbCB2YXJpYWJsZXMgdG8gbXkgbW9kZWwsIGJlY2F1c2UgdGhleSBhbGwgc2VlbSB0byBiZSB2ZXJ5IGltcG9ydGFudC4KCiMjIExpbmVhciBtb2RlbAoKSSB3aWxsIHVzZSBsbSBmdW5jdGlvbiBpbiBSIHRvIGZpbmQgbmVlZGVkIGJldGEgY29lZmZpY2llbnRzIGFuZCBjcmVhdGUgbXkgbW9kZWwKCmBgYHtyfQpwcmljZV9sbSA8LSBsbShmb3JtdWxhID0gcHJpY2UgfiBhcmVhICsgYmVkcm9vbXMgKyBob3R3YXRlcmhlYXRpbmcgKyBhaXJjb25kaXRpb25pbmcgKyBzdG9yaWVzICsgbWFpbnJvYWQgKyBwYXJraW5nICsgZnVybmlzaGluZ3N0YXR1cyArIGJhdGhyb29t
cyArIGd1ZXN0cm9vbSArIGJhc2VtZW50ICsgcHJlZmFyZWEsIGRhdGEgPSBkdF9ob3VzZXMpCgpzdW1tYXJ5KHByaWNlX2xtKQpgYGAKCldlIGdvdCAwLjY4IFItc3F1YXJlZCwgd2hpY2ggaXMgbm90IHRoYXQgYmFkIGZvciBhIG1vZGVsIGp1c3QgbWFkZSB1cC4gQnV0IHRoYXQncyBub3QgYWxsLCBJIHdpbGwgdHJ5IHRvIGRvIGJldHRlciBoZXJlLCBidXQgZmlyc3QsIGFub3RoZXIgbW9kZWwuCgpCdXQgSSB3b3VsZCBsaWtlIHRvIG1lYXN1cmUgcGVyZm9ybWFuY2Ugb2YgbXkgbW9kZWxzIHdpdGggTVNFLCBzbyBJIHdpbGwgY2FsY3VsYXRlIE1TRSBmb3IgbGluZWFyIG1vZGVsLgoKYGBge3J9CnByaWNlX2xtX21zZSA8LSBtZWFuKHByaWNlX2xtJHJlc2lkdWFsc14yKQoKcHJpY2VfbG1fbXNlCmBgYAoKCiMjIFRyZWUgTW9kZWwKCkkgdGhpbmsgdGhpcyBtb2RlbCBjb3VsZCBwZXJmb3JtIGJldHRlciwgYmVjYXVzZSB0aGVyZSBzb21lIHZhcmlhYmxlcyB3aGljaCBjYW4gYWZmZWN0IHRoaXMgbW9kZWwgbm90IG9ubHkgbGluZWFybHksIGJ1dCB0aGUgb3RoZXIgd2F5LCBpbiB0aGlzIGNhc2UgdHJlZSBtb2RlbCBjYW4gc2hvdyBiZXR0ZXIgcGVyZm9ybWFuY2UKCmBgYHtyfQpwcmljZXNfdHJlZSA8LSBycGFydChkYXRhID0gZHRfaG91c2VzLCBmb3JtdWxhID0gcHJpY2UgfiBhcmVhICsgYmVkcm9vbXMgKyBob3R3YXRlcmhlYXRpbmcgKyBhaXJjb25kaXRpb25pbmcgKyBzdG9yaWVzICsgbWFpbnJvYWQgKyBwYXJraW5nICsgZnVybmlzaGluZ3N0YXR1cyArIGJhdGhyb29tcyArIGd1ZXN0cm9vbSArIGJhc2VtZW50ICsgcHJlZmFyZWEsIG1ldGhvZCA9ICdhbm92YScpCgpwcnAocHJpY2VzX3RyZWUsIGRpZ2l0cyA9IC0zKQpgYGAKCmBgYHtyfQpwcmludGNwKHByaWNlc190cmVlKQpgYGAKCk5vdyBJIGhhdmUgYml1bHQgd2l0aCB0aGUgaGVscCBvZiBycGFydCB0cmVlIG1vZGVsIGJhc2VkIG9uIG15IGRhdGFzZXQsIGxldCBleHBsb3JlIGl0OgoKYGBge3J9CnByaWNlc190cmVlCmBgYAoKTm93IGl0IHdvdWxkIGJlIGdyZWF0ZSB0byBwcnVuZSB0aGUgdHJlZSwgYmVjYXVzZSBJIGRvIG5vdCB3YW50IG15IHRyZWUgdG8gb3ZlcmZpdDoKCmBgYHtyfQpwbG90Y3AocHJpY2VzX3RyZWUpCmBgYAoKCmBgYHtyfQpwcmljZXNfdHJlZV9taW5fY3AgPC0gcHJpY2VzX3RyZWUkY3B0YWJsZVt3aGljaC5taW4ocHJpY2VzX3RyZWUkY3B0YWJsZVssICJ4ZXJyb3IiXSksICJDUCJdCm1vZGVsX3RyZWUgPC0gcHJ1bmUocHJpY2VzX3RyZWUsIGNwID0gcHJpY2VzX3RyZWVfbWluX2NwICkKcHJwKHByaWNlc190cmVlLGRpZ2l0cyA9IC0zKQpgYGAKCmFmdGVyIHdlIHBydW5lZCB0aGUgdHJlZSwgbGV0J3MgY2FsY3VsYXRlIHRoZSBNU0UgZm9yIHRoZSB0cmVlIG1vZGVsCgoKYGBge3J9CnByaWNlc190cmVlX3ByZWQgPC0gcHJlZGljdChwcmljZXNfdHJlZSwgZHRfaG91c2VzWywgYygiYXJlYSIsImJhdGhyb29tcyIsICJiZWRyb29tcyIsICJob3R3
YXRlcmhlYXRpbmciLCAiYWlyY29uZGl0aW9uaW5nIiwgInBhcmtpbmciLCAic3RvcmllcyIsICJtYWlucm9hZCIsICJmdXJuaXNoaW5nc3RhdHVzIiwgImd1ZXN0cm9vbSIsICJiYXNlbWVudCIsICJwcmVmYXJlYSIpXSkKcHJpY2VzX3RyZWVfbXNlIDwtIG1lYW4oKGR0X2hvdXNlcyRwcmljZSAtIHByaWNlc190cmVlX3ByZWQpXjIpCgpwcmljZXNfdHJlZV9tc2UKYGBgCgoKIyMgQ29tcGFyaW5nIHR3byBtb2RlbHMKCnByaWNlIGxpbmVhciBtb2RlbCBoYXMgYSBNU0Ugb2YgCgpgYGB7cn0KcHJpY2VfbG1fbXNlCmBgYAoKcHJpY2UgdHJlZSBtb2RlbCBoYXMgYSBNU0Ugb2YgCgpgYGB7cn0KcHJpY2VzX3RyZWVfbXNlCmBgYAoKCkl0IGlzIHN1cnByaXNpbmcgZm9yIG1lLCBhcyBmb3IgYSBwZXJzb24gd2hvIGRvZXMgbm90IGhhdmUgYSBsb3Qgb2YgZXhwZXJpZW5jZSBpbiBtb2RlbGxpbmcsIHRoYXQgbGluZWFyIG1vZGVsIHBlcmZvcm1zIGJldHRlciB0aGFuIHRyZWUgbW9kZWwgYnkgYXBwcm94LiAyMiUuIAoKYGBge3J9CjEwMCAtIHByaWNlX2xtX21zZSAvIHByaWNlc190cmVlX21zZSAqIDEwMApgYGAKCgojIEZlYXR1cmUgRW5naW5lZXJpbmcKCiMjIEZlYXR1cmUgMQojIyMjIGNhbGN1bGF0aW5nIG92ZXJhbGwgYW1vdW50IG9mIHJvb21zCgpIZXJlIEkgd291bGQgbGlrZSB0byB0cnkgYWxsIGlkZWFzIGFuZCBvYnNlcnZhdGlvbnMsIHdoaWNoIEkndmUgaGFkIHRocm91Z2ggbXkgY291cnNlIHdvcmsuIEkndmUgc2VlbiB0d28gY29sdW1ucywgc3VjaCBhcyAiYmVkcm9vbXMiIGFuZCAiYmF0aHJvb21zIiwgdGhleSBzdG9yZSBudW1lcmljYWwgdmFsdWUsIGFtb3VudCBvZiB0aGlzIGtpbmQgb2Ygcm9vbXMuIEl0IG1ha2VzIHNlbnNlIGZvciBtZSB0byBjcmVhdGUgYSBuZXcgY29sdW1uICJyb29tX2NvdW50IiwgYmVjYXVzZSBpdCBtYXkgaGF2ZSBiaWdnZXIgaW1wYWN0IG9uIHRoZSBwZXJmb3JtYW5jZS4KCiMjIyBMaW5lYXIgTW9kZWwKCgpgYGB7cn0KZHRfaG91c2VzWywgJ3Jvb21fY291bnQnIDo9IGJhdGhyb29tcyArIGJlZHJvb21zXQpgYGAKCgpMZXQncyB0cnkgTW9kZWwgd2l0aCBhIG5ldyB2YXJpYWJsZQoKYGBge3J9CnByaWNlX2xtXzIgPC0gbG0oZm9ybXVsYSA9IHByaWNlIH4gYXJlYSArIGJlZHJvb21zICsgaG90d2F0ZXJoZWF0aW5nICsgYWlyY29uZGl0aW9uaW5nICsgc3RvcmllcyArIG1haW5yb2FkICsgcGFya2luZyArIGZ1cm5pc2hpbmdzdGF0dXMgKyBiYXRocm9vbXMgKyBndWVzdHJvb20gKyBiYXNlbWVudCArIHByZWZhcmVhICsgcm9vbV9jb3VudCwgZGF0YSA9IGR0X2hvdXNlcykKCnN1bW1hcnkocHJpY2VfbG1fMikKbWVhbihwcmljZV9sbV8yJHJlc2lkdWFsc14yKQpgYGAKCnRoaXMgaXMgYWJzb2x1dGVseSB0aGUgc2FtZS4gV2UgY2FuIHNlZSwgdGhhdCByb29tX2NvdW50IGhhcyBOQSwgdGhhdCBtZWFucywgdGhpcyB2YXJpYWJsZSBkbyBub3QgbWFrZSB0aGlzIG1vZGVsIGFueSBiZXR0ZXIuCgoj
IyMgVHJlZSBNb2RlbAoKIyMjIENvbXBhcmluZwoKCiMjIEZlYXR1cmUgMgojIyMjIE1vdmluZyBhcmVhIGNsb3NlciB0byBHYXVzc2lhbiAobG9nIHRyYW5zZm9ybWF0aW9uKQoKd2hhdCBpZiB3ZSB3aWxsIHRyeSB0byBicmluZyB0aGUgYXJlYSB2YXJpYWJsZSBjbG9zZXIgdG8gR2F1c3NpYW4gd2l0aCBsb2cgdHJhbnNmb3JtYXRpb24sIGJlY2F1c2UgYXJlYSBkZW5zaXR5IGlzIHNrZXdkIHRvIHRoZSBsZWZ0LCBsb2cgdHJhbnNmb3JtYXRpb24gY2FuIGhlbHAgdXMgdG8gbm9ybWFsaXNlIHRoZSB2YXJpYWJsZS4KCiMjIyBMaW5lYXIgTW9kZWwKCgpgYGB7cn0KZHRfaG91c2VzWywgYXJlYV9sb2cgOj0gbG9nKGFyZWEpXQpgYGAKCgpsaXR0bGUgdmlzdWFsaXNhdGlvbjoKCmBgYHtyfQpnZ3Bsb3QoZGF0YSA9IGR0X2hvdXNlcywgYWVzKHggPSBhcmVhX2xvZykpICsgCiAgZ2VvbV9kZW5zaXR5KGZpbGw9IiNmMWIxNDciLCBjb2xvcj0iI2YxYjE0NyIsIGFscGhhPTAuMjUpICsgCiAgbGFicygKICAgIHggPSAnUHJpY2UnLAogICAgeSA9ICdEZW5zaXR5JwogICkgKwogIHRoZW1lX21pbmltYWwoKSArIAogIHRoZW1lKGF4aXMubGluZSA9IGVsZW1lbnRfbGluZShjb2xvciA9ICIjMDAwMDAwIikpCmBgYAoKYW5kIHRyeSBtb2RlbCBhZ2FpbiA6KQoKYGBge3J9CnByaWNlX2xtXzIgPC0gbG0oZm9ybXVsYSA9IHByaWNlIH4gYXJlYSArIGJlZHJvb21zICsgaG90d2F0ZXJoZWF0aW5nICsgYWlyY29uZGl0aW9uaW5nICsgc3RvcmllcyArIG1haW5yb2FkICsgcGFya2luZyArIGZ1cm5pc2hpbmdzdGF0dXMgKyBiYXRocm9vbXMgKyBndWVzdHJvb20gKyBiYXNlbWVudCArIHByZWZhcmVhICsgcm9vbV9jb3VudCArIGFyZWFfbG9nLCBkYXRhID0gZHRfaG91c2VzKQoKc3VtbWFyeShwcmljZV9sbV8yKQptZWFuKHByaWNlX2xtXzIkcmVzaWR1YWxzXjIpCmBgYAoKaXQgcGVyZm9ybXMgYXBwcm94IDAuNCUgYmV0dGVyLCBpZiB3ZSBsb29rIGF0IFItU3F1YXJlZCBlcnJvci4gTVNFIGFsc28gZ290IGZvciAwLjJlKzEyIGJldHRlci4KCiMjIyBUcmVlIE1vZGVsCgojIyMgQ29tcGFyaW5nCgoKSSB0aGluaywgdGhpcyBjb3VsZCBiZSBhIGdvb2QgSWRlYSB0byB0YWtlIGEgbG9vdCBhdCBhIGNvcnJlbGF0aW9uIGJldHdlZW4gdmFyaWFibGVzLCBidXQgZnJvbSBEYXRhIGV4cGxvcmF0aW9uIEkgY2FuIGFscmVhZHkgc2F5LCB0aGF0IGFyZWEgY29ycmVsYXRlcyB3aXRoIHByaWNlLgoKSGVyZSB3ZSBhcmUsIGNvcnJlbGF0aW9uIHBsb3Q6CgpgYGB7cn0KZ2djb3JycGxvdChjb3JyID0gY29yKGR0X2hvdXNlc1ssIC4ocHJpY2UsIGFyZWEsIGJlZHJvb21zLCBiYXRocm9vbXMsIHN0b3JpZXMsIHBhcmtpbmcpXSksIAogICAgICAgICAgIGhjLm9yZGVyID0gVFJVRSwKICAgICAgICAgICBsYWIgPSBUUlVFKQpgYGAKCkhtLCBjb3JyZWxhdGlvbiBwbG90IGRvZXMgbm90IGxvb2sgYXMgZ3JlYXQsIGFzIEkgaGF2ZSBleHBlY3RlZCwgYnV0IHRoZSBzdHJv
bmdlc3QgY29ycmVsYXRpb24gd2l0aCBwcmljZSBpcyBhcmVhIGFuZCBhbW91bnQgb2YgYmF0aHJvb21zLiAKCiMjIEZlYXR1cmUgMwojIyMjIyBUcmVhdCBiYXRocm9vbXMgYXMgYSBmYWN0b3IgdmFyaWFibGUKCkkgZ290IGFuIElkZWEsIHdlIGhhdmUgYmF0aHJvb21zLCBhbmQgdGhleSBhcmUgaW4gcmFuZ2UgZnJvbSAxIHRvIDQuV2hhdCBpZiB3ZSB3aWxsIHRyZWF0IGVhY2ggYW1vdW50IG9mIGJhdGhyb29tcyBhcyBhIGZhY3RvciB2YXJpYWJsZS4KCiMjIyBMaW5lYXIgTW9kZWwKCgpgYGB7cn0KIyBjcmVhdGluZyBmYWN0b3IKZHRfaG91c2VzWywgY291bnRfYmF0aHJvb21zXzEgOj0gMF1bYmF0aHJvb21zID09IDEsIGNvdW50X2JhdGhyb29tc18xIDo9IDFdCmR0X2hvdXNlc1ssIGNvdW50X2JhdGhyb29tc18yIDo9IDBdW2JhdGhyb29tcyA9PSAyLCBjb3VudF9iYXRocm9vbXNfMiA6PSAxXQpkdF9ob3VzZXNbLCBjb3VudF9iYXRocm9vbXNfMyA6PSAwXVtiYXRocm9vbXMgPT0gMywgY291bnRfYmF0aHJvb21zXzMgOj0gMV0KZHRfaG91c2VzWywgY291bnRfYmF0aHJvb21zXzQgOj0gMF1bYmF0aHJvb21zID09IDQsIGNvdW50X2JhdGhyb29tc180IDo9IDFdCgoKcHJpY2VfbG1fMiA8LSBsbShmb3JtdWxhID0gcHJpY2UgfiBhcmVhICsgYmVkcm9vbXMgKyBob3R3YXRlcmhlYXRpbmcgKyBhaXJjb25kaXRpb25pbmcgKyBzdG9yaWVzICsgbWFpbnJvYWQgKyBwYXJraW5nICsgZnVybmlzaGluZ3N0YXR1cyArIGd1ZXN0cm9vbSArIGJhc2VtZW50ICsgcHJlZmFyZWEgKyBjb3VudF9iYXRocm9vbXNfMSArIGNvdW50X2JhdGhyb29tc18yICsgY291bnRfYmF0aHJvb21zXzMgKyBjb3VudF9iYXRocm9vbXNfNCArIHJvb21fY291bnQgKyBhcmVhX2xvZywgZGF0YSA9IGR0X2hvdXNlcykKCnN1bW1hcnkocHJpY2VfbG1fMikKbWVhbihwcmljZV9sbV8yJHJlc2lkdWFsc14yKQpgYGAKCgpgYGB7cn0KMS4wOTc3OThlKzEyIC0gMS4wODk3MDllKzEyCmBgYAoKWWVzISBOb3cgbW9kZWwgY3JlYXRlcywgMC4wMDgwODllKzEyIGxlc3MgTVNFLiBTbGlnaHQgaW1wcm92ZW1lbnQsIGJ1dCBzdGlsbCBncmVhdC4KCiMjIyBUcmVlIE1vZGVsCgojIyMgQ29tcGFyaW5nCgoKIyMgRmVhdHVyZSA0CiMjIyMgQWlyY29uZGl0aW9uaW5nIGFzIG9uZS1ob3QgZW5jb2RpbmcKClRoaXMgY291bGQgYmUgdGhlIGNhc2UsIGJlY2F1c2UgaWYgdGhpcyBkYXRhc2V0IHdhcyBnYXRoZXJlZCBmcm9tIGEgaG90IGFyZWEsIHdoZXJlIHN1bW1lciBpcyB1c3VhbGx5IHZlcnkgd2FybSwgYWlyY29uZGl0aW9uaW5nIGNvdWxkIGJlIHZlcnkgaW1wb3J0YW50IGZhY3Rvciwgd2hpbGUgYnV5aW5nIGEgaG91c2UgYW5kIGl0IHdpbGwgbWFrZSBwbGFjZSB3aXRoIGl0IG1vcmUgYXR0cmFjdGl2ZSwgYnV0IHdoZW4gdGhlcmUgaXMgbm90IGFueSwgaXQgY291bGQgbWFrZSBwbGFjZSB3b3JzZS4gCgpGb3IgZXhhbXBsZSwgaWYgdGhlcmUgaXMgYWlyY29uZGl0aW9uaW5nIHRo
ZXJlIGNvdWxkIGJlIEJldGEgPSB4LCBidXQgaWYgdGhlcmUgaXMgbm90LCBpdCBpcyBub3QgMCwgaXQgaXMgLXkgdmFsdWUgZnJvbSB0aGUgcHJvcGVydHkuCgpMZXQncyBkbyB0aGlzCgpgYGB7cn0KIyBjcmVhdGluZyBmYWN0b3JzCmR0X2hvdXNlc1ssIGFpcmNvbmRpdGlvbmluZ19pbnRlZ2VyIDo9IDBdW2FpcmNvbmRpdGlvbmluZyA9PSAneWVzJywgYWlyY29uZGl0aW9uaW5nX3llcyA6PSAxXQpkdF9ob3VzZXNbLCBhaXJjb25kaXRpb25pbmdfaW50ZWdlciA6PSAwXVthaXJjb25kaXRpb25pbmcgPT0gJ25vJywgYWlyY29uZGl0aW9uaW5nX25vIDo9IDFdCgojIGNhbGN1bGF0aW5nIGFuZCBydW5uaW5nIG1vZGVsCnByaWNlX2xtXzIgPC0gbG0oZm9ybXVsYSA9IHByaWNlIH4gKyBhcmVhICsgYmVkcm9vbXMgKyAgYWlyY29uZGl0aW9uaW5nICsgaG90d2F0ZXJoZWF0aW5nICsgc3RvcmllcyArIG1haW5yb2FkICsgcGFya2luZyArIGZ1cm5pc2hpbmdzdGF0dXMgKyBndWVzdHJvb20gKyBiYXNlbWVudCArIHByZWZhcmVhICsgY291bnRfYmF0aHJvb21zXzEgKyBjb3VudF9iYXRocm9vbXNfMiArIGNvdW50X2JhdGhyb29tc18zICsgY291bnRfYmF0aHJvb21zXzQgKyByb29tX2NvdW50ICsgYXJlYV9sb2cgKyBhaXJjb25kaXRpb25pbmdfeWVzICsgYWlyY29uZGl0aW9uaW5nX25vLCBkYXRhID0gZHRfaG91c2VzKQoKc3VtbWFyeShwcmljZV9sbV8yKQptZWFuKHByaWNlX2xtXzIkcmVzaWR1YWxzXjIpCgpgYGAKCioqKgoKCgoK